Clinically driven semi-supervised class discovery in gene expression data
Abstract
MOTIVATION: Unsupervised class discovery in gene expression data lets the statistical signals in the data exclusively drive the results. Often, however, one wants to constrain the search space to respect certain biological prior knowledge while still allowing a flexible search within these boundaries.
RESULTS: We develop an approach to semi-supervised class discovery. One component of our approach uses clinical sample information to constrain the search space and guide the class discovery process to yield biologically relevant partitions. A second component uses known biological annotation of genes to drive the search, seeking partitions that manifest strong differential expression in specific sets of genes. We develop efficient algorithms for these tasks, implementing both approaches and combinations thereof. We show that our method is robust enough to detect known clinical parameters in accordance with expected clinical values. We also use our method to elucidate putative risk factors for cardiovascular disease (CVD).
AVAILABILITY: MonoClaD (Monotone Class Discovery). See http://bioinfo.cs.technion.ac.il/people/zohar/MonoClad/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at http://bioinfo.cs.technion.ac.il/people/zohar/MonoClad/software.html
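The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of clinically constrained class discovery, not the MonoClaD implementation. It assumes that the candidate partitions are the two-class splits that respect the ordering of a single numeric clinical parameter, and that a partition is scored by the number of genes whose Welch t-statistic exceeds a cutoff. The function names, the cutoff, the optional `genes` argument (standing in for an annotated gene set) and the toy data are all hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least 2 values)."""
    diff = mean(a) - mean(b)
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0.0:
        # Both groups are constant: no evidence if equal, unbounded otherwise.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / denom


def monotone_partitions(clinical, min_size=2):
    """Yield (threshold, low, high) for every split that keeps all samples
    with smaller clinical values on one side (assumed search space)."""
    order = sorted(range(len(clinical)), key=lambda i: clinical[i])
    for k in range(min_size, len(order) - min_size + 1):
        lo_val, hi_val = clinical[order[k - 1]], clinical[order[k]]
        if lo_val == hi_val:
            continue  # never separate samples with equal clinical values
        yield (lo_val + hi_val) / 2, order[:k], order[k:]


def score_partition(expr, low, high, t_cutoff=5.0, genes=None):
    """Count genes (rows of expr) with |t| >= t_cutoff between the classes.
    If genes is given, only those row indices are scored."""
    rows = range(len(expr)) if genes is None else genes
    count = 0
    for g in rows:
        a = [expr[g][i] for i in low]
        b = [expr[g][i] for i in high]
        if abs(welch_t(a, b)) >= t_cutoff:
            count += 1
    return count


def best_monotone_partition(expr, clinical, min_size=2, t_cutoff=5.0, genes=None):
    """Return (score, threshold, low, high) for the best-scoring split."""
    best = None
    for threshold, low, high in monotone_partitions(clinical, min_size):
        s = score_partition(expr, low, high, t_cutoff, genes)
        if best is None or s > best[0]:
            best = (s, threshold, low, high)
    return best


# Toy data: 3 genes x 8 samples, plus one made-up clinical value per sample.
CLINICAL = [1, 2, 3, 4, 5, 6, 7, 8]
EXPR = [
    [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0],  # switches on at high values
    [5.0, 5.2, 4.9, 5.1, 2.0, 2.1, 1.9, 2.0],  # switches off at high values
    [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],  # unrelated to the parameter
]

if __name__ == "__main__":
    score, threshold, low, high = best_monotone_partition(EXPR, CLINICAL)
    print(score, threshold, sorted(low), sorted(high))
```

On this toy data the best split falls between clinical values 4 and 5, where two of the three genes pass the cutoff. Restricting the search to ordered splits keeps the number of candidate partitions linear in the number of samples, which is one simple way to read "constraining the search space"; the paper's actual constraints and scoring may differ.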
Similar resources
Semi-Supervised Self-Organizing Feature Map for Gene Classification
In this thesis, a study on gene expression data analysis is done using some supervised, unsupervised and semi-supervised approaches. The task of class prediction for six gene expression datasets (namely, Brain Tumor, Colon Cancer, Leukemia, Lymphoma and SRBCT) has been carried out. Here, a one-dimensional self-organizing feature map (SOFM) in a semi-supervised learning framework is developed f...
Un-Normalized Graph P-Laplacian Semi-Supervised Learning Method Applied to Cancer Classification Problem
A successful classification of different tumor types is essential for successful treatment of cancer. However, most prior cancer classification methods are clinical-based and have inadequate diagnostic ability. Cancer classification using gene expression data is very important in cancer diagnosis and drug discovery. The introduction of DNA microarray techniques has made simultaneous monitoring ...
Semi-supervised learning via penalized mixture model with application to microarray sample classification
MOTIVATION It is biologically interesting to address whether human blood outgrowth endothelial cells (BOECs) belong to or are closer to large vessel endothelial cells (LVECs) or microvascular endothelial cells (MVECs) based on global expression profiling. An earlier analysis using a hierarchical clustering and a small set of genes suggested that BOECs seemed to be closer to MVECs. By taking adv...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
A Semi-supervised Algorithm for Pattern Discovery in Information Extraction from Textual Data
In this article we present a semi-supervised algorithm for pattern discovery in information extraction from textual data. The patterns that are discovered take the form of regular expressions that generate regular languages. We term our approach ‘semi-supervised’ because it requires significantly less effort to develop a training set than other approaches. From the training data our algorithm a...
Journal: Bioinformatics
Volume 24, Issue 16
Publication date: 2008